library(dplyr)
library(INLA)
library(inlabru)
library(sf)
library(terra)
library(tidyverse)
# load some libraries to generate nice map plots
library(scico)
library(ggplot2)
library(patchwork)Practical 7
Aim of this practical:
In this first practical we are going to look multiple likelihood models
Libraries to load:
Multiple likelihood models
Simple simulated example
In this example we are going to fit a model where one covariate acts in the same way for two different responses
Gaussian observations
Poisson observations
We first define the unobserved latent field and then simulate both Gaussian and Poisson observations.
N = 200
x = runif(N)
df = data.frame(idx = 1:N,
x = x)
# simulate data
df = df %>%
mutate(y_gaus = rnorm(N, mean = 1 + 1.5 * x), sd = 0.5) %>%
mutate(y_pois = rpois(N, lambda = exp( -1 + 1.5 * x)))
# plot the data
df %>% ggplot() +
geom_point(aes(x, y_gaus, color = "Gaussian")) +
geom_point(aes(x, y_pois, color = "Poisson")) The model we aim to fit is
\[ \begin{aligned} \eta_t & = \beta_0 + \beta_1x_t,\ t = 1,\dots,T\\ y^{\text{Gaus}} &\sim\mathcal{N}(\mu_t,\sigma^2_y)\\ \eta^{\text{Gaus}}_t & = \mu_t\\ y^{\text{Pois}}_t & \sim\text{Poisson}(\lambda_t)\\ \eta^{\text{Pois}}_t & =\log(\lambda_t) \end{aligned} \]
We first define the components
cmp = ~ -1 +
Intercept_gaus(1) +
Intercept_pois(1) +
covariate(x, model = "linear") and the two likelihoods:
lik_gaus = bru_obs(formula = y_gaus ~ Intercept_gaus + covariate,
data = df)
lik_pois = bru_obs(formula = y_pois ~ Intercept_pois + covariate,
data = df,
family = "poisson")Coregionalization model
In this practical we present a way to fit the Bayesian coregionalization model similar to the one presented in Chapter 8 in Blangiardo and Cameletti (2015).
These models are often used when measurement stations record several variables; for example, a station measuring pollution may register values of CO2 and NO2. Instead of modelling these as several univariate datasets, the models we present in this section deal with the joint dependency structure. Dependencies among the different outcomes are modelled through shared components at the predictor level.
Usually, in coregionalization models, the different responses are assumed to be observed at the same locations. With the INLA-SPDE approach, we do not require the different outcome variables to be measured at the same locations. Hence, in the code example below we show how to model responses observed at different locations.
The model
We consider the following model \[ \begin{aligned} y_1(s) & = \beta_1 + \omega_1(s) + e_1(s)\\ y_2(s) & = \beta_2 + \lambda\ \omega_1(s) + \omega_2(s) +e_1(s), \end{aligned} \] where the
\(\beta_k\) are intercepts, \(k=1,2\).
\(\omega_1(s)\) is a Gaussian spatial effect that is common for both observations
\(\lambda\) is a weight for the spatial effect \(\omega_1(s)\).
\(\omega_2(s)\) is a Gaussian spatial effect specific to the second observations
\(e_k(s)\) are uncorrelated error terms, \(k=1,2\).
Data simulation
We start by simulating the data. We first define a function to sample from a Matern RF with given range and sd.
rMatern <- function(n, coords, sigma=1, range,
kappa = sqrt(8*nu)/range,
variance = sigma^2,
nu=1) {
m <- as.matrix(dist(coords))
m <- exp((1-nu)*log(2) + nu*log(kappa*m)-
lgamma(nu))*besselK(m*kappa, nu)
diag(m) <- 1
return(drop(crossprod(chol(variance*m),
matrix(rnorm(nrow(coords)*n), ncol=n))))
}We then define the true values of all parameters
# Intercept on reparametrized model
beta <- c(-5, 3)
# Random field marginal variances for omega1 and omega2:
m.var <- c(0.5, 0.4)
# GRF range parameters for omega1 and omega2:
range <- c(4, 6)
# Copy parameters: reparameterization of coregionalization
# parameters
lambda <- c(0.7)
# Standard deviations of error terms
e.sd <- c(0.3, 0.2)and simulate our data. We assume that we observe data in a window \((0:10)\times(0:5)\). We assume that in some locations we observe both \(y_1\) and \(y_2\) while in others we only observe one of them
# define the area of interest
poly_geom = st_polygon(list(cbind(c(0,10,10,0,0), c(0,0,5,5,0)) ))
# Wrap it in an sfc (simple feature collection)
poly_sfc <- st_sfc(poly_geom)
# Now create the sf object
border <- st_sf(id = 1, geometry = poly_sfc)
# how many observation we have
n1 <- 200
n2 <- 150
n_common = 50
# simulate observation locations
loc_common = st_sf(geometry = st_sample(border, n_common))
loc_only1 = st_sf(geometry = st_sample(border, n1-n_common))
loc_only2 = st_sf(geometry = st_sample(border, n2-n_common))
# simulate the two gaussian field at the locations
z1 <- rMatern(1, st_coordinates(rbind( loc_common,loc_only1, loc_only2)), range = range[1],
sigma = sqrt(m.var[1]))
z2 <- rMatern(1, st_coordinates(rbind(loc_common, loc_only2)), range = range[2],
sigma = sqrt(m.var[2]))
## Create data.frame
loc1 = rbind( loc_common, loc_only1)
loc2 = rbind( loc_common, loc_only2)
df1 = loc1 %>% mutate(z1 = z1[1:n1])
df2 = loc2 %>% mutate(z1 = z1[-c(1:(n1-n_common))], z2 =z2)
## create the linear predictors
df1 = df1 %>%
mutate(eta1 = beta[1] + z1)
df2 = df2 %>%
mutate(eta2 = beta[2] + lambda * z1 + z2)
# simulate data by addint the obervation noise
df1 = df1 %>%
mutate(y = rnorm(n1, mean = eta1, sd = e.sd[1]))
df2 = df2 %>%
mutate(y = rnorm(n2, mean = eta2, sd = e.sd[1]))We can visualize the observations
p1 = ggplot(data = df1) + geom_sf(aes(color = z1))
p2 = ggplot(data = df2) + geom_sf(aes(color = z2))
p1+p2+plot_layout(ncol = 1)Mesh definition
We need to define a mesh. We use the location observations and the area of interest as a starting point.
mesh <- fm_mesh_2d(loc = rbind(loc1, loc2),
boundary = border,
max.edge = c(0.5, 1.5),
offset = c(0.1, 2.5),
cutoff = 0.1)Here is the mesh, together with the observations \(y_1\) (red) and \(y_2\) (black).
SPDE definition
Run the model
We now need to define the components of the model:
cmp = ~ -1 + Intercept1(1) + Intercept2(1) +
omega1(geometry, model = spde) +
omega1_copy(geometry, copy = "omega1", fixed = FALSE) +
omega2(geometry, model = spde)